Dataset statistics
| Number of variables | 13 |
|---|---|
| Number of observations | 2106 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 214.0 KiB |
| Average record size in memory | 104.1 B |
Variable types
| Categorical | 3 |
|---|---|
| Numeric | 10 |
Alerts
| Alert | Type |
|---|---|
| datum has a high cardinality: 2106 distinct values | High cardinality |
| R06 is highly correlated with Month | High correlation |
| Month is highly correlated with R06 | High correlation |
| datum is uniformly distributed | Uniform |
| Weekday Name is uniformly distributed | Uniform |
| datum has unique values | Unique |
| M01AB has 40 (1.9%) zeros | Zeros |
| M01AE has 36 (1.7%) zeros | Zeros |
| N02BA has 78 (3.7%) zeros | Zeros |
| N02BE has 26 (1.2%) zeros | Zeros |
| N05B has 43 (2.0%) zeros | Zeros |
| N05C has 1430 (67.9%) zeros | Zeros |
| R03 has 484 (23.0%) zeros | Zeros |
| R06 has 256 (12.2%) zeros | Zeros |
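The high-cardinality and uniqueness alerts for datum are what one would expect when a date column is stored as text, with one distinct string per row. The following is a minimal sketch, assuming the strings follow the month/day/year pattern seen in the sample rows (for example 1/2/2014). The function name `derive_calendar_fields` is hypothetical and is not part of the report or of pandas-profiling.

```python
from datetime import datetime

# English weekday names indexed by datetime.weekday() (Monday == 0).
# A fixed list avoids depending on the system locale.
WEEKDAY_NAMES = [
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday",
]


def derive_calendar_fields(datum: str):
    """Parse a month/day/year string and return (year, month, weekday name).

    Assumption: the text format is %m/%d/%Y, as in the report's sample rows.
    """
    parsed = datetime.strptime(datum, "%m/%d/%Y")
    return parsed.year, parsed.month, WEEKDAY_NAMES[parsed.weekday()]


# First and last dates shown in the report's sample rows.
print(derive_calendar_fields("1/2/2014"))
print(derive_calendar_fields("10/8/2019"))
```

Parsing the column into a real date type before profiling would likely let a profiler treat it as a date rather than as a high-cardinality category, though how a given tool reacts is not confirmed by this report.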
Reproduction
| Analysis started | 2022-02-23 07:02:11.664788 |
|---|---|
| Analysis finished | 2022-02-23 07:02:37.356078 |
| Duration | 25.69 seconds |
| Software version | pandas-profiling v3.1.1 |
| Download configuration | config.json |
datum
Categorical
| Distinct | 2106 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 16.6 KiB |
| 1/2/2014 | 1 |
|---|---|
| 11/16/2017 | 1 |
| 11/14/2017 | 1 |
| 11/13/2017 | 1 |
| 11/12/2017 | 1 |
| Other values (2101) | 2101 |
Length
| Max length | 10 |
|---|---|
| Median length | 9 |
| Mean length | 8.924026591 |
| Min length | 8 |
Characters and Unicode
| Total characters | 18794 |
|---|---|
| Distinct characters | 11 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 2106 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | 1/2/2014 |
|---|---|
| 2nd row | 1/3/2014 |
| 3rd row | 1/4/2014 |
| 4th row | 1/5/2014 |
| 5th row | 1/6/2014 |
Common Values
| Value | Count | Frequency (%) |
| 1/2/2014 | 1 | < 0.1% |
| 11/16/2017 | 1 | < 0.1% |
| 11/14/2017 | 1 | < 0.1% |
| 11/13/2017 | 1 | < 0.1% |
| 11/12/2017 | 1 | < 0.1% |
| 11/11/2017 | 1 | < 0.1% |
| 11/10/2017 | 1 | < 0.1% |
| 11/9/2017 | 1 | < 0.1% |
| 11/8/2017 | 1 | < 0.1% |
| 11/7/2017 | 1 | < 0.1% |
| Other values (2096) | 2096 |
Length
Histogram of lengths of the category
| Value | Count | Frequency (%) |
| 1/2/2014 | 1 | < 0.1% |
| 1/4/2014 | 1 | < 0.1% |
| 1/6/2014 | 1 | < 0.1% |
| 1/7/2014 | 1 | < 0.1% |
| 1/8/2014 | 1 | < 0.1% |
| 1/9/2014 | 1 | < 0.1% |
| 1/10/2014 | 1 | < 0.1% |
| 1/11/2014 | 1 | < 0.1% |
| 1/12/2014 | 1 | < 0.1% |
| 1/13/2014 | 1 | < 0.1% |
| Other values (2096) | 2096 |
Most occurring characters
| Value | Count | Frequency (%) |
| / | 4212 | |
| 1 | 3846 | |
| 2 | 3323 | |
| 0 | 2470 | |
| 7 | 759 | 4.0% |
| 8 | 759 | 4.0% |
| 5 | 759 | 4.0% |
| 6 | 754 | 4.0% |
| 4 | 752 | 4.0% |
| 9 | 663 | 3.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 14582 | |
| Other Punctuation | 4212 | 22.4% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 3846 | |
| 2 | 3323 | |
| 0 | 2470 | |
| 7 | 759 | 5.2% |
| 8 | 759 | 5.2% |
| 5 | 759 | 5.2% |
| 6 | 754 | 5.2% |
| 4 | 752 | 5.2% |
| 9 | 663 | 4.5% |
| 3 | 497 | 3.4% |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 4212 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 18794 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| / | 4212 | |
| 1 | 3846 | |
| 2 | 3323 | |
| 0 | 2470 | |
| 7 | 759 | 4.0% |
| 8 | 759 | 4.0% |
| 5 | 759 | 4.0% |
| 6 | 754 | 4.0% |
| 4 | 752 | 4.0% |
| 9 | 663 | 3.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 18794 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| / | 4212 | |
| 1 | 3846 | |
| 2 | 3323 | |
| 0 | 2470 | |
| 7 | 759 | 4.0% |
| 8 | 759 | 4.0% |
| 5 | 759 | 4.0% |
| 6 | 754 | 4.0% |
| 4 | 752 | 4.0% |
| 9 | 663 | 3.5% |
M01AB
Real number (ℝ≥0)
| Distinct | 218 |
|---|---|
| Distinct (%) | 10.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.033683325 |
| Minimum | 0 |
| Maximum | 17.34 |
| Zeros | 40 |
| Zeros (%) | 1.9% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 16.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 3 |
| median | 4.99 |
| Q3 | 6.67 |
| 95-th percentile | 10 |
| Maximum | 17.34 |
| Range | 17.34 |
| Interquartile range (IQR) | 3.67 |
Descriptive statistics
| Standard deviation | 2.737578507 |
|---|---|
| Coefficient of variation (CV) | 0.543851953 |
| Kurtosis | 0.5900347612 |
| Mean | 5.033683325 |
| Median Absolute Deviation (MAD) | 1.99 |
| Skewness | 0.6423125147 |
| Sum | 10600.93708 |
| Variance | 7.494336084 |
| Monotonicity | Not monotonic |
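Several of the derived figures in the quantile and descriptive statistics tables directly above follow from the others. The sketch below recomputes them from the reported mean, standard deviation, quartiles and extremes. It only checks that the published numbers are mutually consistent; the estimator conventions the profiler used (for example sample versus population variance) are not stated in the report and are not assumed here.

```python
# Figures copied from the quantile and descriptive statistics tables above.
n_observations = 2106
mean = 5.033683325
std_dev = 2.737578507
q1, q3 = 3.0, 6.67
minimum, maximum = 0.0, 17.34

# Derived quantities and how they relate to the inputs.
variance = std_dev ** 2                      # variance is the squared standard deviation
coefficient_of_variation = std_dev / mean    # CV is spread relative to the mean
iqr = q3 - q1                                # interquartile range
value_range = maximum - minimum              # range
total = mean * n_observations                # sum recovered from mean and count

print(f"variance = {variance:.6f}")
print(f"CV       = {coefficient_of_variation:.6f}")
print(f"IQR      = {iqr:.2f}")
print(f"range    = {value_range:.2f}")
print(f"sum      = {total:.3f}")
```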
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 4 | 126 | 6.0% |
| 5 | 119 | 5.7% |
| 3 | 111 | 5.3% |
| 6 | 99 | 4.7% |
| 2 | 99 | 4.7% |
| 1 | 63 | 3.0% |
| 7 | 60 | 2.8% |
| 5.33 | 58 | 2.8% |
| 3.33 | 57 | 2.7% |
| 2.33 | 55 | 2.6% |
| Other values (208) | 1259 |
| Value | Count | Frequency (%) |
| 0 | 40 | 1.9% |
| 0.2125 | 1 | < 0.1% |
| 0.33 | 6 | 0.3% |
| 0.34 | 10 | 0.5% |
| 0.66 | 2 | 0.1% |
| 0.67 | 5 | 0.2% |
| 0.68 | 2 | 0.1% |
| 0.83 | 1 | < 0.1% |
| 1 | 63 | |
| 1.18 | 2 | 0.1% |
| Value | Count | Frequency (%) |
| 17.34 | 1 | |
| 17 | 1 | |
| 16.68 | 1 | |
| 16.18 | 1 | |
| 15.33 | 1 | |
| 14.66 | 1 | |
| 14.33 | 1 | |
| 14.18 | 1 | |
| 14.01 | 1 | |
| 14 | 1 |
M01AE
Real number (ℝ≥0)
| Distinct | 694 |
|---|---|
| Distinct (%) | 33.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.895830316 |
| Minimum | 0 |
| Maximum | 14.463 |
| Zeros | 36 |
| Zeros (%) | 1.7% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 16.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0.81625 |
| Q1 | 2.34 |
| median | 3.67 |
| Q3 | 5.138 |
| 95-th percentile | 7.66 |
| Maximum | 14.463 |
| Range | 14.463 |
| Interquartile range (IQR) | 2.798 |
Descriptive statistics
| Standard deviation | 2.133336599 |
|---|---|
| Coefficient of variation (CV) | 0.5475948453 |
| Kurtosis | 0.8918244314 |
| Mean | 3.895830316 |
| Median Absolute Deviation (MAD) | 1.34 |
| Skewness | 0.7183773911 |
| Sum | 8204.618646 |
| Variance | 4.551125046 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 3 | 55 | 2.6% |
| 2 | 53 | 2.5% |
| 3.34 | 52 | 2.5% |
| 4 | 47 | 2.2% |
| 2.34 | 45 | 2.1% |
| 1 | 39 | 1.9% |
| 2.33 | 38 | 1.8% |
| 0 | 36 | 1.7% |
| 3.33 | 34 | 1.6% |
| 1.34 | 34 | 1.6% |
| Other values (684) | 1673 |
| Value | Count | Frequency (%) |
| 0 | 36 | 1.7% |
| 0.033 | 1 | < 0.1% |
| 0.066 | 3 | 0.1% |
| 0.157 | 1 | < 0.1% |
| 0.198 | 1 | < 0.1% |
| 0.231 | 1 | < 0.1% |
| 0.33 | 10 | 0.5% |
| 0.34 | 12 | 0.6% |
| 0.363 | 2 | 0.1% |
| 0.373 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 14.463 | 1 | |
| 13.34 | 1 | |
| 12.706 | 1 | |
| 11.99 | 1 | |
| 11.745 | 1 | |
| 11.69 | 1 | |
| 11.505 | 1 | |
| 11.32 | 1 | |
| 11.31 | 1 | |
| 10.76 | 1 |
N02BA
Real number (ℝ≥0)
| Distinct | 199 |
|---|---|
| Distinct (%) | 9.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.880441121 |
| Minimum | 0 |
| Maximum | 16 |
| Zeros | 78 |
| Zeros (%) | 3.7% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 16.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0.6 |
| Q1 | 2 |
| median | 3.5 |
| Q3 | 5.2 |
| 95-th percentile | 8 |
| Maximum | 16 |
| Range | 16 |
| Interquartile range (IQR) | 3.2 |
Descriptive statistics
| Standard deviation | 2.384010237 |
|---|---|
| Coefficient of variation (CV) | 0.6143657802 |
| Kurtosis | 1.187551278 |
| Mean | 3.880441121 |
| Median Absolute Deviation (MAD) | 1.5 |
| Skewness | 0.8494966517 |
| Sum | 8172.209 |
| Variance | 5.683504808 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 3 | 217 | 10.3% |
| 2 | 212 | 10.1% |
| 4 | 178 | 8.5% |
| 1 | 147 | 7.0% |
| 5 | 143 | 6.8% |
| 6 | 104 | 4.9% |
| 0 | 78 | 3.7% |
| 7 | 63 | 3.0% |
| 8 | 31 | 1.5% |
| 3.5 | 30 | 1.4% |
| Other values (189) | 903 |
| Value | Count | Frequency (%) |
| 0 | 78 | 3.7% |
| 0.1 | 3 | 0.1% |
| 0.15 | 2 | 0.1% |
| 0.2 | 8 | 0.4% |
| 0.25 | 6 | 0.3% |
| 0.3 | 2 | 0.1% |
| 0.4 | 1 | < 0.1% |
| 0.416666667 | 1 | < 0.1% |
| 0.45 | 1 | < 0.1% |
| 0.5 | 3 | 0.1% |
| Value | Count | Frequency (%) |
| 16 | 1 | |
| 15 | 1 | |
| 14.4 | 1 | |
| 14 | 1 | |
| 13.7 | 1 | |
| 13.3 | 1 | |
| 13.29166667 | 1 | |
| 12.7 | 1 | |
| 12.5 | 1 | |
| 12.3 | 1 |
N02BE
Real number (ℝ≥0)
| Distinct | 713 |
|---|---|
| Distinct (%) | 33.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 29.9170953 |
| Minimum | 0 |
| Maximum | 161 |
| Zeros | 26 |
| Zeros (%) | 1.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 16.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 10.2 |
| Q1 | 19 |
| median | 26.9 |
| Q3 | 38.3 |
| 95-th percentile | 59.9375 |
| Maximum | 161 |
| Range | 161 |
| Interquartile range (IQR) | 19.3 |
Descriptive statistics
| Standard deviation | 15.59096554 |
|---|---|
| Coefficient of variation (CV) | 0.5211390137 |
| Kurtosis | 3.418982129 |
| Mean | 29.9170953 |
| Median Absolute Deviation (MAD) | 9.3 |
| Skewness | 1.202291302 |
| Sum | 63005.40271 |
| Variance | 243.0782065 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 15 | 37 | 1.8% |
| 19 | 36 | 1.7% |
| 20 | 34 | 1.6% |
| 18 | 34 | 1.6% |
| 24 | 34 | 1.6% |
| 22 | 34 | 1.6% |
| 14 | 32 | 1.5% |
| 23 | 30 | 1.4% |
| 16 | 27 | 1.3% |
| 21 | 26 | 1.2% |
| Other values (703) | 1782 |
| Value | Count | Frequency (%) |
| 0 | 26 | 1.2% |
| 1.2 | 1 | < 0.1% |
| 2 | 2 | 0.1% |
| 3 | 1 | < 0.1% |
| 6 | 3 | 0.1% |
| 6.2 | 1 | < 0.1% |
| 6.5 | 1 | < 0.1% |
| 6.6 | 2 | 0.1% |
| 6.7 | 1 | < 0.1% |
| 6.8 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 161 | 1 | |
| 108.7 | 1 | |
| 100.1 | 1 | |
| 97.8 | 1 | |
| 93.05 | 1 | |
| 89.3 | 1 | |
| 88.4 | 1 | |
| 88.3 | 1 | |
| 88.2 | 1 | |
| 87.2 | 1 |
N05B
Real number (ℝ≥0)
| Distinct | 77 |
|---|---|
| Distinct (%) | 3.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 8.853626543 |
| Minimum | 0 |
| Maximum | 54.83333333 |
| Zeros | 43 |
| Zeros (%) | 2.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 16.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 5 |
| median | 8 |
| Q3 | 12 |
| 95-th percentile | 19 |
| Maximum | 54.83333333 |
| Range | 54.83333333 |
| Interquartile range (IQR) | 7 |
Descriptive statistics
| Standard deviation | 5.605604747 |
|---|---|
| Coefficient of variation (CV) | 0.633142218 |
| Kurtosis | 4.022887927 |
| Mean | 8.853626543 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | 1.324840405 |
| Sum | 18645.7375 |
| Variance | 31.42280458 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 6 | 167 | 7.9% |
| 7 | 166 | 7.9% |
| 5 | 160 | 7.6% |
| 4 | 159 | 7.5% |
| 9 | 154 | 7.3% |
| 8 | 151 | 7.2% |
| 3 | 133 | 6.3% |
| 11 | 130 | 6.2% |
| 10 | 118 | 5.6% |
| 12 | 102 | 4.8% |
| Other values (67) | 666 |
| Value | Count | Frequency (%) |
| 0 | 43 | 2.0% |
| 1 | 49 | 2.3% |
| 2 | 89 | |
| 2.6 | 1 | < 0.1% |
| 3 | 133 | |
| 3.5 | 3 | 0.1% |
| 4 | 159 | |
| 5 | 160 | |
| 5.5 | 1 | < 0.1% |
| 6 | 167 |
| Value | Count | Frequency (%) |
| 54.83333333 | 1 | < 0.1% |
| 43 | 1 | < 0.1% |
| 36 | 1 | < 0.1% |
| 33 | 3 | |
| 32 | 2 | |
| 31 | 4 | |
| 30 | 1 | < 0.1% |
| 29 | 2 | |
| 28.33333333 | 1 | < 0.1% |
| 28 | 2 |
N05C
Real number (ℝ≥0)
| Distinct | 20 |
|---|---|
| Distinct (%) | 0.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.5935224755 |
| Minimum | 0 |
| Maximum | 9 |
| Zeros | 1430 |
| Zeros (%) | 67.9% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 16.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 1 |
| 95-th percentile | 3 |
| Maximum | 9 |
| Range | 9 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 1.09298832 |
|---|---|
| Coefficient of variation (CV) | 1.841528106 |
| Kurtosis | 8.763071582 |
| Mean | 0.5935224755 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 2.520467859 |
| Sum | 1249.958333 |
| Variance | 1.194623468 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=20)
| Value | Count | Frequency (%) |
| 0 | 1430 | 67.9% |
| 1 | 334 | 15.9% |
| 2 | 157 | 7.5% |
| 3 | 113 | 5.4% |
| 4 | 23 | 1.1% |
| 5 | 12 | 0.6% |
| 0.833333333 | 5 | 0.2% |
| 1.25 | 5 | 0.2% |
| 6 | 4 | 0.2% |
| 0.416666667 | 4 | 0.2% |
| Other values (10) | 19 | 0.9% |
| Value | Count | Frequency (%) |
| 0 | 1430 | 67.9% |
| 0.416666667 | 4 | 0.2% |
| 0.625 | 2 | 0.1% |
| 0.833333333 | 5 | 0.2% |
| 1 | 334 | 15.9% |
| 1.25 | 5 | 0.2% |
| 1.666666667 | 2 | 0.1% |
| 1.875 | 1 | < 0.1% |
| 2 | 157 | 7.5% |
| 2.083333333 | 4 | 0.2% |
| Value | Count | Frequency (%) |
| 9 | 2 | 0.1% |
| 8 | 2 | 0.1% |
| 7 | 2 | 0.1% |
| 6 | 4 | 0.2% |
| 5 | 12 | 0.6% |
| 4.166666667 | 1 | < 0.1% |
| 4 | 23 | 1.1% |
| 3 | 113 | |
| 2.916666667 | 1 | < 0.1% |
| 2.5 | 2 | 0.1% |
R03
Real number (ℝ≥0)
| Distinct | 64 |
|---|---|
| Distinct (%) | 3.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.512261594 |
| Minimum | 0 |
| Maximum | 45 |
| Zeros | 484 |
| Zeros (%) | 23.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 16.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 4 |
| Q3 | 8 |
| 95-th percentile | 20 |
| Maximum | 45 |
| Range | 45 |
| Interquartile range (IQR) | 7 |
Descriptive statistics
| Standard deviation | 6.428736372 |
|---|---|
| Coefficient of variation (CV) | 1.166261119 |
| Kurtosis | 4.015505418 |
| Mean | 5.512261594 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | 1.829746242 |
| Sum | 11608.82292 |
| Variance | 41.32865135 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 0 | 484 | 23.0% |
| 1 | 257 | |
| 5 | 188 | 8.9% |
| 2 | 173 | 8.2% |
| 6 | 146 | 6.9% |
| 3 | 119 | 5.7% |
| 7 | 93 | 4.4% |
| 10 | 84 | 4.0% |
| 4 | 73 | 3.5% |
| 8 | 63 | 3.0% |
| Other values (54) | 426 |
| Value | Count | Frequency (%) |
| 0 | 484 | 23.0% |
| 0.416666667 | 1 | < 0.1% |
| 1 | 257 | |
| 1.25 | 1 | < 0.1% |
| 1.416666667 | 1 | < 0.1% |
| 1.666666667 | 3 | 0.1% |
| 1.875 | 1 | < 0.1% |
| 2 | 173 | 8.2% |
| 2.5 | 3 | 0.1% |
| 2.916666667 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 45 | 1 | < 0.1% |
| 41 | 1 | < 0.1% |
| 37 | 1 | < 0.1% |
| 36 | 1 | < 0.1% |
| 35 | 1 | < 0.1% |
| 34 | 2 | 0.1% |
| 33 | 2 | 0.1% |
| 31 | 6 | |
| 30 | 3 | |
| 29 | 4 |
R06
Real number (ℝ≥0)
| Distinct | 98 |
|---|---|
| Distinct (%) | 4.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.900198243 |
| Minimum | 0 |
| Maximum | 15 |
| Zeros | 256 |
| Zeros (%) | 12.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 16.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 2 |
| Q3 | 4 |
| 95-th percentile | 8 |
| Maximum | 15 |
| Range | 15 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 2.415815905 |
|---|---|
| Coefficient of variation (CV) | 0.8329830247 |
| Kurtosis | 1.989410302 |
| Mean | 2.900198243 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 1.292813236 |
| Sum | 6107.8175 |
| Variance | 5.836166485 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 2 | 392 | |
| 1 | 351 | |
| 3 | 295 | |
| 0 | 256 | 12.2% |
| 4 | 173 | |
| 5 | 143 | 6.8% |
| 6 | 84 | 4.0% |
| 7 | 49 | 2.3% |
| 8 | 26 | 1.2% |
| 10 | 24 | 1.1% |
| Other values (88) | 313 |
| Value | Count | Frequency (%) |
| 0 | 256 | 12.2% |
| 0.1 | 3 | 0.1% |
| 0.2 | 4 | 0.2% |
| 0.3 | 1 | < 0.1% |
| 0.33 | 1 | < 0.1% |
| 0.34 | 1 | < 0.1% |
| 0.4 | 4 | 0.2% |
| 0.5 | 2 | 0.1% |
| 0.6 | 1 | < 0.1% |
| 0.625 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 15 | 2 | 0.1% |
| 13.5 | 1 | < 0.1% |
| 12.4 | 1 | < 0.1% |
| 12.2 | 1 | < 0.1% |
| 12.1 | 1 | < 0.1% |
| 12 | 6 | 0.3% |
| 11 | 9 | 0.4% |
| 10.5 | 4 | 0.2% |
| 10.4 | 1 | < 0.1% |
| 10 | 24 |
Year
Real number (ℝ≥0)
| Distinct | 6 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2016.401235 |
| Minimum | 2014 |
| Maximum | 2019 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 16.6 KiB |
Quantile statistics
| Minimum | 2014 |
|---|---|
| 5-th percentile | 2014 |
| Q1 | 2015 |
| median | 2016 |
| Q3 | 2018 |
| 95-th percentile | 2019 |
| Maximum | 2019 |
| Range | 5 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 1.665060368 |
|---|---|
| Coefficient of variation (CV) | 0.0008257584549 |
| Kurtosis | -1.221280398 |
| Mean | 2016.401235 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.04472617313 |
| Sum | 4246541 |
| Variance | 2.772426029 |
| Monotonicity | Increasing |
Histogram with fixed size bins (bins=6)
| Value | Count | Frequency (%) |
| 2016 | 366 | |
| 2015 | 365 | |
| 2017 | 365 | |
| 2018 | 365 | |
| 2014 | 364 | |
| 2019 | 281 |
| Value | Count | Frequency (%) |
| 2014 | 364 | |
| 2015 | 365 | |
| 2016 | 366 | |
| 2017 | 365 | |
| 2018 | 365 | |
| 2019 | 281 |
| Value | Count | Frequency (%) |
| 2019 | 281 | |
| 2018 | 365 | |
| 2017 | 365 | |
| 2016 | 366 | |
| 2015 | 365 | |
| 2014 | 364 |
Month
Real number (ℝ≥0)
| Distinct | 12 |
|---|---|
| Distinct (%) | 0.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.344254511 |
| Minimum | 1 |
| Maximum | 12 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 16.6 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 3 |
| median | 6 |
| Q3 | 9 |
| 95-th percentile | 12 |
| Maximum | 12 |
| Range | 11 |
| Interquartile range (IQR) | 6 |
Descriptive statistics
| Standard deviation | 3.386953836 |
|---|---|
| Coefficient of variation (CV) | 0.5338615955 |
| Kurtosis | -1.161574205 |
| Mean | 6.344254511 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | 0.04384499493 |
| Sum | 13361 |
| Variance | 11.47145628 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=12)
| Value | Count | Frequency (%) |
| 3 | 186 | |
| 5 | 186 | |
| 7 | 186 | |
| 8 | 186 | |
| 1 | 185 | |
| 4 | 180 | |
| 6 | 180 | |
| 9 | 180 | |
| 2 | 169 | |
| 10 | 163 | |
| Other values (2) | 305 |
| Value | Count | Frequency (%) |
| 1 | 185 | |
| 2 | 169 | |
| 3 | 186 | |
| 4 | 180 | |
| 5 | 186 | |
| 6 | 180 | |
| 7 | 186 | |
| 8 | 186 | |
| 9 | 180 | |
| 10 | 163 |
| Value | Count | Frequency (%) |
| 12 | 155 | |
| 11 | 150 | |
| 10 | 163 | |
| 9 | 180 | |
| 8 | 186 | |
| 7 | 186 | |
| 6 | 180 | |
| 5 | 186 | |
| 4 | 180 | |
| 3 | 186 |
Hour
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 16.6 KiB |
| 276 | 2104 |
|---|---|
| 248 | 1 |
| 190 | 1 |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 6318 |
|---|---|
| Distinct characters | 8 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 2 |
|---|---|
| Unique (%) | 0.1% |
Sample
| 1st row | 248 |
|---|---|
| 2nd row | 276 |
| 3rd row | 276 |
| 4th row | 276 |
| 5th row | 276 |
Common Values
| Value | Count | Frequency (%) |
| 276 | 2104 | |
| 248 | 1 | < 0.1% |
| 190 | 1 | < 0.1% |
Length
Histogram of lengths of the category
Pie chart
| Value | Count | Frequency (%) |
| 276 | 2104 | |
| 248 | 1 | < 0.1% |
| 190 | 1 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 2105 | |
| 7 | 2104 | |
| 6 | 2104 | |
| 4 | 1 | < 0.1% |
| 8 | 1 | < 0.1% |
| 1 | 1 | < 0.1% |
| 9 | 1 | < 0.1% |
| 0 | 1 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 6318 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 2105 | |
| 7 | 2104 | |
| 6 | 2104 | |
| 4 | 1 | < 0.1% |
| 8 | 1 | < 0.1% |
| 1 | 1 | < 0.1% |
| 9 | 1 | < 0.1% |
| 0 | 1 | < 0.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 6318 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 2 | 2105 | |
| 7 | 2104 | |
| 6 | 2104 | |
| 4 | 1 | < 0.1% |
| 8 | 1 | < 0.1% |
| 1 | 1 | < 0.1% |
| 9 | 1 | < 0.1% |
| 0 | 1 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 6318 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2 | 2105 | |
| 7 | 2104 | |
| 6 | 2104 | |
| 4 | 1 | < 0.1% |
| 8 | 1 | < 0.1% |
| 1 | 1 | < 0.1% |
| 9 | 1 | < 0.1% |
| 0 | 1 | < 0.1% |
Weekday Name
Categorical
| Distinct | 7 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 16.6 KiB |
| Thursday | 301 |
|---|---|
| Friday | 301 |
| Saturday | 301 |
| Sunday | 301 |
| Monday | 301 |
| Other values (2) | 601 |
Length
| Max length | 9 |
|---|---|
| Median length | 7 |
| Mean length | 7.141975309 |
| Min length | 6 |
Characters and Unicode
| Total characters | 15041 |
|---|---|
| Distinct characters | 17 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Thursday |
|---|---|
| 2nd row | Friday |
| 3rd row | Saturday |
| 4th row | Sunday |
| 5th row | Monday |
Common Values
| Value | Count | Frequency (%) |
| Thursday | 301 | |
| Friday | 301 | |
| Saturday | 301 | |
| Sunday | 301 | |
| Monday | 301 | |
| Tuesday | 301 | |
| Wednesday | 300 |
Length
Histogram of lengths of the category
Pie chart
| Value | Count | Frequency (%) |
| thursday | 301 | |
| friday | 301 | |
| saturday | 301 | |
| sunday | 301 | |
| monday | 301 | |
| tuesday | 301 | |
| wednesday | 300 |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 2407 | |
| d | 2406 | |
| y | 2106 | |
| u | 1204 | |
| r | 903 | 6.0% |
| n | 902 | 6.0% |
| s | 902 | 6.0% |
| e | 901 | 6.0% |
| T | 602 | 4.0% |
| S | 602 | 4.0% |
| Other values (7) | 2106 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 12935 | |
| Uppercase Letter | 2106 | 14.0% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 2407 | |
| d | 2406 | |
| y | 2106 | |
| u | 1204 | |
| r | 903 | 7.0% |
| n | 902 | 7.0% |
| s | 902 | 7.0% |
| e | 901 | 7.0% |
| o | 301 | 2.3% |
| t | 301 | 2.3% |
| Other values (2) | 602 | 4.7% |
Uppercase Letter
| Value | Count | Frequency (%) |
| T | 602 | |
| S | 602 | |
| M | 301 | |
| F | 301 | |
| W | 300 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 15041 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 2407 | |
| d | 2406 | |
| y | 2106 | |
| u | 1204 | |
| r | 903 | 6.0% |
| n | 902 | 6.0% |
| s | 902 | 6.0% |
| e | 901 | 6.0% |
| T | 602 | 4.0% |
| S | 602 | 4.0% |
| Other values (7) | 2106 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 15041 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 2407 | |
| d | 2406 | |
| y | 2106 | |
| u | 1204 | |
| r | 903 | 6.0% |
| n | 902 | 6.0% |
| s | 902 | 6.0% |
| e | 901 | 6.0% |
| T | 602 | 4.0% |
| S | 602 | 4.0% |
| Other values (7) | 2106 |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
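The calculation described above (rank both variables, then divide the covariance of the ranks by the product of their standard deviations) can be sketched in plain Python. This is an illustrative sketch, not the code the report itself ran; the helper names `average_ranks`, `pearson_r` and `spearman_rho` are hypothetical, and tied values are assumed to share the mean of their rank positions.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))


# A monotonic but nonlinear relationship: rho reaches 1 while r stays below 1.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print("rho =", spearman_rho(x, y))
print("r   =", pearson_r(x, y))
```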
Pearson's r
Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
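As a minimal sketch of the formula above, the following plain-Python function divides the covariance by the product of the standard deviations and then illustrates the invariance under shifts and positive rescaling. The function name `pearson_r` is hypothetical; the sample values are the first five M01AB and N05B entries from the report's first rows, used purely for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / (spread_x * spread_y)


m01ab = [0.00, 8.00, 2.00, 4.00, 5.00]
n05b = [7.0, 16.0, 10.0, 8.0, 16.0]

original = pearson_r(m01ab, n05b)
# Shift and rescale each variable separately (positive scale factors).
transformed = pearson_r([2 * v + 3 for v in m01ab], [0.5 * v - 1 for v in n05b])
print(original, transformed)  # the two values agree up to rounding
```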
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
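The pair-counting definition above can be sketched directly. This version implements exactly the ratio described in the text (often called tau-a); it makes no correction for ties, whereas statistical libraries commonly default to a tie-corrected variant, so values may differ from the report's on data with many repeated values. The function name `kendall_tau_a` is hypothetical.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1      # both variables move in the same direction
        elif direction < 0:
            discordant += 1      # the variables move in opposite directions
        # direction == 0 means a tie in x or y; it counts for neither side
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs


# Six pairs in total: five concordant, one discordant (the swapped 3 and 2).
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```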
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
First rows
| | datum | M01AB | M01AE | N02BA | N02BE | N05B | N05C | R03 | R06 | Year | Month | Hour | Weekday Name |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1/2/2014 | 0.00 | 3.67 | 3.4 | 32.40 | 7.0 | 0.0 | 0.0 | 2.0 | 2014 | 1 | 248 | Thursday |
| 1 | 1/3/2014 | 8.00 | 4.00 | 4.4 | 50.60 | 16.0 | 0.0 | 20.0 | 4.0 | 2014 | 1 | 276 | Friday |
| 2 | 1/4/2014 | 2.00 | 1.00 | 6.5 | 61.85 | 10.0 | 0.0 | 9.0 | 1.0 | 2014 | 1 | 276 | Saturday |
| 3 | 1/5/2014 | 4.00 | 3.00 | 7.0 | 41.10 | 8.0 | 0.0 | 3.0 | 0.0 | 2014 | 1 | 276 | Sunday |
| 4 | 1/6/2014 | 5.00 | 1.00 | 4.5 | 21.70 | 16.0 | 2.0 | 6.0 | 2.0 | 2014 | 1 | 276 | Monday |
| 5 | 1/7/2014 | 0.00 | 0.00 | 0.0 | 0.00 | 0.0 | 0.0 | 0.0 | 0.0 | 2014 | 1 | 276 | Tuesday |
| 6 | 1/8/2014 | 5.33 | 3.00 | 10.5 | 26.40 | 19.0 | 1.0 | 10.0 | 0.0 | 2014 | 1 | 276 | Wednesday |
| 7 | 1/9/2014 | 7.00 | 1.68 | 8.0 | 25.00 | 16.0 | 0.0 | 3.0 | 2.0 | 2014 | 1 | 276 | Thursday |
| 8 | 1/10/2014 | 5.00 | 2.00 | 2.0 | 53.30 | 15.0 | 2.0 | 0.0 | 2.0 | 2014 | 1 | 276 | Friday |
| 9 | 1/11/2014 | 5.00 | 4.34 | 10.4 | 52.30 | 14.0 | 0.0 | 1.0 | 0.2 | 2014 | 1 | 276 | Saturday |
Last rows
| | datum | M01AB | M01AE | N02BA | N02BE | N05B | N05C | R03 | R06 | Year | Month | Hour | Weekday Name |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2096 | 9/29/2019 | 3.51 | 3.867 | 3.00 | 67.80 | 6.0 | 0.0 | 3.0 | 2.10 | 2019 | 9 | 276 | Sunday |
| 2097 | 9/30/2019 | 2.00 | 1.439 | 2.10 | 49.40 | 9.0 | 0.0 | 5.0 | 2.00 | 2019 | 9 | 276 | Monday |
| 2098 | 10/1/2019 | 11.34 | 2.406 | 0.10 | 47.00 | 15.0 | 4.0 | 17.0 | 1.50 | 2019 | 10 | 276 | Tuesday |
| 2099 | 10/2/2019 | 5.18 | 3.274 | 2.80 | 30.20 | 9.0 | 1.0 | 0.0 | 1.10 | 2019 | 10 | 276 | Wednesday |
| 2100 | 10/3/2019 | 5.00 | 3.000 | 4.00 | 40.40 | 10.0 | 0.0 | 2.0 | 2.00 | 2019 | 10 | 276 | Thursday |
| 2101 | 10/4/2019 | 7.34 | 5.683 | 2.25 | 22.45 | 13.0 | 0.0 | 1.0 | 1.00 | 2019 | 10 | 276 | Friday |
| 2102 | 10/5/2019 | 3.84 | 5.010 | 6.00 | 25.40 | 7.0 | 0.0 | 0.0 | 0.33 | 2019 | 10 | 276 | Saturday |
| 2103 | 10/6/2019 | 4.00 | 11.690 | 2.00 | 34.60 | 6.0 | 0.0 | 5.0 | 4.20 | 2019 | 10 | 276 | Sunday |
| 2104 | 10/7/2019 | 7.34 | 4.507 | 3.00 | 50.80 | 6.0 | 0.0 | 10.0 | 1.00 | 2019 | 10 | 276 | Monday |
| 2105 | 10/8/2019 | 0.33 | 1.730 | 0.50 | 44.30 | 20.0 | 2.0 | 2.0 | 0.00 | 2019 | 10 | 190 | Tuesday |